SHORT GCG EXPANSIONS IN THE PAB II GENE FOR
OCULOPHARYNGEAL MUSCULAR DYSTROPHY AND DIAGNOSTIC THEREOF

BACKGROUND OF THE INVENTION 

(a) Field of the Invention

The invention relates to the PAB II gene, and its uses
for the diagnosis, prognosis and treatment of a disease
related to protein accumulation in the nucleus, such as
oculopharyngeal muscular dystrophy.

(b) Description of Prior Art

Autosomal dominant oculopharyngeal muscular dystrophy
(OPMD) is an adult-onset disease with a worldwide
distribution. It usually presents in the sixth decade with
progressive swallowing difficulties (dysphagia), eyelid
drooping (ptosis) and proximal limb weakness. Unique nuclear
filament inclusions in skeletal muscle fibers are its
pathological hallmark (Tome, F.M.S. & Fardeau, Acta
Neuropath. 49, 85-87 (1980)). We isolated the poly(A)
binding protein II (PAB II) gene from a 217 kb candidate
interval in chromosome 14q11. A (GCG)6 repeat encoding a
polyalanine tract located at the N-terminus of the protein
was expanded to (GCG)8-13 in the 144 OPMD families screened.
More severe phenotypes were observed in compound
heterozygotes for the (GCG)9 mutation and a (GCG)7 allele
found in 2% of the population, whereas homozygosity for the
(GCG)7 allele leads to autosomal recessive OPMD. Thus the
(GCG)7 allele is an example of a polymorphism which can act
either as a modifier of a dominant phenotype or as a
recessive mutation. Pathological expansions of the
polyalanine tract may cause mutated PAB II oligomers to
accumulate as filament inclusions in nuclei.

It would be highly desirable to be provided with a tool
for the diagnosis, prognosis and treatment of a disease
related to polyalanine accumulation in the nucleus, such as
oculopharyngeal muscular dystrophy.

SUMMARY OF THE INVENTION

One aim of the present invention is to provide a tool
for the diagnosis, prognosis and treatment of a disease
related to polyalanine accumulation in the nucleus, such as
oculopharyngeal muscular dystrophy.

In accordance with the present invention there is
provided a human PAB II gene containing a transcribed
polymorphic GCG repeat, which comprises a sequence as set
forth in Fig. 4, which includes introns and flanking genomic
sequence.

The allelic variants of the GCG repeat of the human
PAB II gene are associated with a disease related to protein
accumulation in the nucleus, such as polyalanine
accumulation, or with a disease related to swallowing
difficulties, such as oculopharyngeal muscular dystrophy.

In accordance with the present invention there is also
provided a method for the diagnosis of a disease with
protein accumulation in the nucleus, which comprises the
steps of:

a) obtaining a nucleic acid sample from a patient; and

b) determining allelic variants of the GCG repeat of the
human PAB II gene, wherein long allelic variants are
indicative of a disease related to protein accumulation
in the nucleus, such as polyalanine accumulation and
oculopharyngeal muscular dystrophy.

The long allelic variants are from about 245 to about
263 bp in length.

In accordance with the present invention there is also
provided a non-human mammal model for the human PAB II gene,
whose germ cells and somatic cells are modified to express
at least one allelic variant of the PAB II gene, wherein
said allelic variant of the PAB II gene is introduced into
the mammal, or an ancestor of the mammal, at an embryonic
stage.

In accordance with the present invention there 
is also provided a method for the screening of thera- 
peutic agents for the prevention and/or treatment of 
oculopharyngeal muscular dystrophy, which comprises the 
steps of: 

a) administering said therapeutic agents to the 
non-human mammal of the present invention or 
oculopharyngeal muscular dystrophy patients; and 

b) evaluating the prevention and/or treatment of 
development of oculopharyngeal muscular dystro- 
phy in said mammal or said patients. 

In accordance with the present invention there is also
provided a method to identify genes that are part of, or
interact with, a biochemical pathway affected by the PAB II
gene, which comprises the steps of:

a) designing probes and/or primers using the hGTl gene of
the PAB II gene and screening oculopharyngeal muscular
dystrophy patient samples with said probes and/or
primers; and

b) evaluating the role of the identified gene in
oculopharyngeal muscular dystrophy patients.

BRIEF DESCRIPTION OF THE DRAWINGS 

Figs. 1A-B illustrate the positional cloning of the
PAB II gene;

Figs. 2A-G illustrate the OPMD (GCG)n expansion sizes
and sequence of mutations (SEQ ID NOS:1-2);

Fig. 3 illustrates the age distribution of swallowing
time (st) for French Canadian OPMD carriers of the (GCG)9
mutation; and

Fig. 4 illustrates the nucleotide sequence of human
poly(A) binding protein II (hPAB II) (SEQ ID NO:3).

DETAILED DESCRIPTION OF THE INVENTION

In order to identify the gene mutated in OPMD, we
constructed a 350 kb cosmid contig between flanking markers
D14S990 and D14S1457 (Fig. 1A). Fig. 1A shows the positions
of the PAB II selected cDNA clones in relation to the EcoRI
restriction map and the Genealogy-based Estimate of
Historical Meiosis (GEHM)-derived candidate interval
(Rommens, J.M. et al., in Proceedings of the third
international workshop on the identification of transcribed
sequences (eds. Hochgeschwender, U. & Gardiner, K.) 65-79
(Plenum, New York, 1994)).

The human poly(A) binding protein II gene (PAB II) is
encoded by the nucleotide sequence as set forth in Fig. 4.

Twenty-five cDNAs were isolated by cDNA selection from
the candidate interval (Rommens, J.M. et al., in Proceedings
of the third international workshop on the identification of
transcribed sequences (eds. Hochgeschwender, U. & Gardiner,
K.) 65-79 (Plenum, New York, 1994)). Three of these
hybridized to a common 20 kb EcoRI restriction fragment and
showed high sequence homology to the bovine poly(A) binding
protein II gene (bPAB II) (Fig. 1A). The PAB II gene
appeared to be a good candidate for OPMD because it mapped
to the genetically defined 0.26 cM candidate interval in
14q11 (Fig. 1A), its mRNA showed a high level of expression
in skeletal muscle, and the PAB II protein is exclusively
localized to the nucleus (Krause, S. et al., Exp. Cell Res.
214, 75-82 (1994)) where it acts as a factor in mRNA
polyadenylation (Wahle, E., Cell 66, 759-768 (1991); Wahle,
E. et al., J. Biol. Chem. 268, 2937-2945 (1993); Bienroth,
S. et al., EMBO J. 12, 585-594 (1993)).

We subcloned an 8 kb HindIII genomic fragment
containing the PAB II gene, and sequenced 6002 bp (GenBank:
AF026029) (Nemeth, A. et al., Nucleic Acids Res. 23,
4034-4041 (1995)) (Fig. 1B). Fig. 1B shows the genomic
structure of the PAB II gene and the position of the OPMD
(GCG)n expansions. Exons are numbered. Introns 1 and 6 are
variably present in 60% of cDNA clones. ORF, open reading
frame; cen, centromere; tel, telomere.

The coding sequence was based on the previously
published bovine sequence (GenBank: X89969) and the sequence
of 31 human cDNAs and ESTs. The gene is composed of 7 exons
and is transcribed in the cen-qter orientation (Fig. 1B).
Multiple splice variants are found in ESTs and on Northern
blots (Nemeth, A. et al., Nucleic Acids Res. 23, 4034-4041
(1995)). In particular, introns 1 and 6 are present in more
than 60% of clones (Fig. 1B) (Nemeth, A. et al., Nucleic
Acids Res. 23, 4034-4041 (1995)). The coding and protein
sequences are highly conserved between human, bovine and
mouse (GenBank: U93050). 93% of the PAB II sequence was
readily amenable to RT-PCR or genomic SSCP screening. No
mutations were uncovered using either technique. However, a
400 bp region of exon 1 containing the start codon could not
be readily amplified. This region is 80% GC rich. It
includes a (GCG)6 repeat which codes for the first six
alanines of a homopolymeric stretch of 10 (Fig. 2G). Fig. 2G
shows the nucleotide sequence of the mutated region of
PAB II, the amino acid sequence of the N-terminal
polyalanine stretch and the position of the OPMD alanine
insertions.
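By way of illustration only, the relationship between the (GCG)6 repeat and the 10-alanine stretch can be sketched as follows. The short sequence is spelled out from the start of the open reading frame as it appears in the SEQ ID NO:3 listing of this document; the function names are hypothetical and the sketch assumes an in-frame read that begins at the ATG start codon.

```python
import re

# Start of the PAB II open reading frame as read from the SEQ ID NO:3
# listing: ATG, six GCG codons, three GCA codons, one GCG codon, then GGG.
NORMAL_ORF_START = "ATG" + "GCG" * 6 + "GCA" * 3 + "GCG" + "GGGGC"

ALANINE_CODONS = {"GCA", "GCC", "GCG", "GCT"}


def count_gcg_repeats(orf: str) -> int:
    """Count consecutive GCG codons directly after the ATG start codon."""
    match = re.match(r"ATG((?:GCG)*)", orf.upper())
    if match is None:
        raise ValueError("sequence does not start with ATG")
    return len(match.group(1)) // 3


def polyalanine_length(orf: str) -> int:
    """Length of the uninterrupted alanine run following the initiator Met."""
    seq = orf.upper()
    length = 0
    for i in range(3, len(seq) - 2, 3):
        if seq[i:i + 3] not in ALANINE_CODONS:
            break
        length += 1
    return length


def with_extra_repeats(orf: str, extra: int) -> str:
    """Hypothetical helper: insert `extra` GCG codons right after the ATG."""
    return orf[:3] + "GCG" * extra + orf[3:]


normal_repeats = count_gcg_repeats(NORMAL_ORF_START)
normal_alanines = polyalanine_length(NORMAL_ORF_START)
expanded = with_extra_repeats(NORMAL_ORF_START, 3)  # models a (GCG)9 allele
print(normal_repeats, normal_alanines)
print(count_gcg_repeats(expanded), polyalanine_length(expanded))
```

The three GCA codons and the following GCG codon are counted as alanines but not as part of the pure GCG run, which is why six repeats correspond to a stretch of ten alanines.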

Special conditions were designed to amplify by PCR a
242 bp genomic fragment including this GCG repeat. The
(GCG)6 allele was found in 98% of French Canadian non-OPMD
control chromosomes, whereas 2% of chromosomes carried a
(GCG)7 polymorphism (n=86) (Brais, B. et al., Hum. Mol.
Genet. 4, 429-434 (1995)).
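As a rough illustration of what a 2% allele frequency implies, the expected genotype proportions in controls can be sketched as below. This assumes Hardy-Weinberg proportions and only two alleles, (GCG)6 and (GCG)7; neither assumption is stated in the text, and the function name is hypothetical.

```python
def hardy_weinberg(q: float) -> dict:
    """Expected genotype frequencies for a two-allele locus.

    q is the frequency of the (GCG)7 allele; the (GCG)6 allele is
    assumed to account for the remainder (an assumption of this sketch).
    """
    if not 0.0 <= q <= 1.0:
        raise ValueError("allele frequency must lie between 0 and 1")
    p = 1.0 - q
    return {"6/6": p * p, "6/7": 2 * p * q, "7/7": q * q}


freqs = hardy_weinberg(0.02)
for genotype, freq in freqs.items():
    print(f"(GCG){genotype}: {freq:.4f}")
```

Under these assumptions most (GCG)7 chromosomes are carried by heterozygotes, and homozygotes are expected to be rare.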

Screening OPMD cases belonging to 144 families showed
in all cases a PCR product larger by 6 to 21 bp than that
found in controls (Fig. 2A). Fig. 2A shows the (GCG)6 normal
allele (N) and the six different (GCG)n expansions observed
in the 144 families.
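The fragment sizes quoted in this description follow from simple arithmetic: each additional GCG repeat adds 3 bp to the 242 bp fragment obtained for the (GCG)6 allele. A minimal sketch, with hypothetical function names and assuming that size differences arise only from whole GCG repeats:

```python
NORMAL_REPEATS = 6        # (GCG)6 allele
NORMAL_FRAGMENT_BP = 242  # PCR fragment size for the (GCG)6 allele
BP_PER_REPEAT = 3         # one GCG codon


def fragment_size(repeats: int) -> int:
    """Expected PCR fragment size in bp for a (GCG)n allele."""
    return NORMAL_FRAGMENT_BP + BP_PER_REPEAT * (repeats - NORMAL_REPEATS)


def repeats_from_size(size_bp: int) -> int:
    """Infer the repeat number from an observed fragment size."""
    extra_bp = size_bp - NORMAL_FRAGMENT_BP
    if extra_bp % BP_PER_REPEAT != 0:
        raise ValueError("size difference is not a multiple of 3 bp")
    return NORMAL_REPEATS + extra_bp // BP_PER_REPEAT


for n in range(6, 14):
    print(f"(GCG){n}: {fragment_size(n)} bp")
```

On this basis the (GCG)8 to (GCG)13 expansions are 6 to 21 bp larger than the normal fragment, and the range of about 245 to about 263 bp corresponds to (GCG)7 through (GCG)13.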

Sequencing of these fragments revealed that the
increased sizes were due to expansions of the GCG repeat
(Fig. 2G). Fig. 2F shows the sequence of the (GCG)9 French
Canadian expansion in a heterozygous parent and his
homozygous child: partial sequence of exon 1 in a normal
(GCG)6 control (N), a heterozygote (ht.) and a homozygote
(hm.) for the (GCG)9-repeat mutation. The number of families
sharing the different (GCG)n-repeat expansions is shown in
Table 1.

Table 1

Number of families sharing the different dominant (GCG)n
OPMD mutations

Mutations    Polyalanine†    Families
(GCG)8       12              4
(GCG)9       13              99
(GCG)10      14              19
(GCG)11      15              16
(GCG)12      16              5
(GCG)13      17              1
Total                        144

†, 10 alanine residues in normal PAB II.
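The figures in Table 1 can be cross-checked with a short sketch. The polyalanine column is taken to be the normal stretch of 10 alanines plus one alanine per GCG repeat beyond six; the variable and function names are illustrative only.

```python
# Families per dominant (GCG)n mutation, as listed in Table 1.
FAMILIES_BY_REPEATS = {8: 4, 9: 99, 10: 19, 11: 16, 12: 5, 13: 1}

NORMAL_REPEATS = 6
NORMAL_POLYALANINE = 10  # alanine residues in normal PAB II


def polyalanine_residues(repeats: int) -> int:
    """Polyalanine length implied by a (GCG)n allele."""
    return NORMAL_POLYALANINE + (repeats - NORMAL_REPEATS)


total_families = sum(FAMILIES_BY_REPEATS.values())
polyalanine_column = [polyalanine_residues(n) for n in sorted(FAMILIES_BY_REPEATS)]
most_frequent = max(FAMILIES_BY_REPEATS, key=FAMILIES_BY_REPEATS.get)

print(total_families)
print(polyalanine_column)
print(f"most frequent mutation: (GCG){most_frequent}")
```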

The (GCG)9 expansion shared by 70 French Canadian
families is the most frequent mutation we observed
(Table 1). The (GCG)9 expansion is quite stable, with a
single doubling observed in family F151 in an estimated 598
French Canadian meioses (Fig. 2C). Fig. 2C demonstrates the
doubling of the French Canadian (GCG)9 expansion in family
F151.

This contrasts with the unstable nature of previously
described disease-causing triplet-repeats (Rosenberg, R.N.,
New Eng. J. Med. 335, 1222-1224 (1996)).

Genotyping of all the participants in our clinical
study of French Canadian OPMD provided molecular insights
into the clinical variability observed in this condition.
The genotypes for both copies of the PAB II mutated region
were added to an anonymous version of our clinical database
of 176 (GCG)9 mutation carriers (Brais, B. et al., Hum. Mol.
Genet. 4, 429-434 (1995)). Severity of the phenotype can be
assessed by the swallowing time (st), in seconds, taken to
drink 80 cc of ice-cold water (Brais, B. et al., Hum. Mol.
Genet. 4, 429-434 (1995); Bouchard, J.-P. et al., Can. J.
Neurol. Sci. 19, 296-297 (1992)). The late onset and
progressive nature of the muscular dystrophy is clearly
illustrated in heterozygous carriers of the (GCG)9 mutation
(bold curve in Fig. 3) when compared with the average st of
control (GCG)6 homozygous participants (n=76, thinner line
in Fig. 3). In Fig. 3, the bold curve represents the average
OPMD st for carriers of only one copy of the (GCG)9 mutation
(n=169), while the thinner line corresponds to the average
st for (GCG)6 homozygous normal controls (n=76). The black
dot corresponds to the st value for individual VIII. Roman
numerals refer to individual cases shown in Figs. 2D and 2E
and discussed in the text. Fig. 2B shows the genotype of a
homozygous (GCG)9 case and her parents. Fig. 2D shows the
independent segregation of the (GCG)7 allele; case V has a
more severe OPMD phenotype.

Two groups of genotypically distinct OPMD cases have
more severe swallowing difficulties. Individuals I, II and
III have an early-onset disease and are homozygous for the
(GCG)9 expansion (P < 10⁻⁵) (Figs. 2B, F). Cases IV, V, VI
and VII have more severe phenotypes and are compound
heterozygotes for the (GCG)9 mutation and the (GCG)7
polymorphism (P < 10⁻⁵). In Fig. 2D the independent
segregation of the two alleles is shown. Case V, who
inherited the French Canadian (GCG)9 mutation and the (GCG)7
polymorphism, is more symptomatic than his brother VIII, who
carries the (GCG)9 mutation and a normal (GCG)6 allele
(Figs. 2D and 3). The (GCG)7 polymorphism thus appears to be
a modifier of severity of dominant OPMD. Furthermore, the
(GCG)7 allele can act as a recessive mutation. This was
documented in the French patient IX, who inherited two
copies of the (GCG)7 polymorphism and has a late-onset
autosomal recessive form of OPMD (Fig. 2E).
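The genotype-phenotype relationships described above can be summarized in a small lookup sketch. It is illustrative only: the text documents (GCG)9 homozygotes, (GCG)9/(GCG)7 compound heterozygotes and (GCG)7 homozygotes, and extending the same rules to the other expanded alleles of Table 1 is an assumption of this sketch, as are the function name and the wording of the labels.

```python
def classify_genotype(allele_a: int, allele_b: int) -> str:
    """Map a pair of (GCG)n repeat counts to the phenotype class in the text.

    Assumption: all alleles with 8 to 13 repeats behave like the
    documented (GCG)9 mutation.
    """
    low, high = sorted((allele_a, allele_b))
    if low < 6 or high > 13:
        raise ValueError("repeat count outside the 6 to 13 range described")
    if high == 6:
        return "normal"
    if high == 7:
        if low == 7:
            return "autosomal recessive OPMD"
        return "carrier of the (GCG)7 polymorphism"
    # From here on at least one allele carries a dominant expansion.
    if low == 6:
        return "dominant OPMD"
    if low == 7:
        return "dominant OPMD, more severe (compound heterozygote)"
    return "dominant OPMD, more severe (two expanded alleles)"


for pair in [(6, 6), (6, 7), (7, 7), (6, 9), (7, 9), (9, 9)]:
    print(pair, "->", classify_genotype(*pair))
```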

This is the first description of short trinucleotide
repeat expansions causing a human disease. The addition of
only two GCG repeats is sufficient to cause dominant OPMD.
OPMD expansions do not share the cardinal features of
"dynamic mutations". The GCG expansions are not only short
but also meiotically quite stable. Furthermore, there is a
clear cut-off between the normal and abnormal alleles, a
single GCG expansion causing a recessive phenotype. The
PAB II (GCG)7 allele is the first example of a relatively
frequent allele which can act either as a modifier of a
dominant phenotype or as a recessive mutation. This dosage
effect is reminiscent of the one observed in a homozygote
for two dominant synpolydactyly mutations. In this case, the
patient had more severe deformities because she inherited
two duplications causing an expansion in the polyalanine
tract of the HOXD13 protein (Akarsu, A.N. et al., Hum. Mol.
Genet. 5, 945-952 (1996)). A duplication causing a similar
polyalanine expansion in the α subunit 1 gene of the
core-binding transcription factor (CBFα1) has also been
found to cause dominant cleidocranial dysplasia (Mundlos, S.
et al., Cell 89, 773-779 (1997)). The mutations in these two
rare diseases are not triplet-repeats. They are duplications
of "cryptic repeats" composed of mixed synonymous codons and
are thought to result from unequal crossing over (Warren,
S.T., Science 275, 408-409 (1997)). In the case of OPMD,
slippage during replication causing a reiteration of the GCG
codon is a more likely mechanism (Wells, D.R., J. Biol.
Chem. 271, 2875-2878 (1996)).

Different observations converge to suggest that a gain
of function of PAB II may cause the accumulation of nuclear
filaments observed in OPMD (Tome, F.M.S. & Fardeau, Acta
Neuropath. 49, 85-87 (1980)). PAB II is found mostly in
dimeric and oligomeric form (Nemeth, A. et al., Nucleic
Acids Res. 23, 4034-4041 (1995)). It is possible that the
polyalanine tract plays a role in polymerization.
Polyalanine stretches have been found in many other nuclear
proteins such as the HOX proteins, but their function is
still unknown (Davies, S.W. et al., Cell 90, 537-548
(1997)). Alanine is a highly hydrophobic amino acid present
in the cores of proteins. In dragline spider silk,
polyalanine stretches are thought to form β-sheet structures
important in ensuring the fibers' strength (Simmons, A.H. et
al., Science 271, 84-87 (1996)). Polyalanine oligomers have
also been shown to be extremely resistant to chemical
denaturation and enzymatic degradation (Forood, B. et al.,
Bioch. and Biophy. Res. Com. 211, 7-13 (1995)). One can
speculate that PAB II oligomers comprised of a sufficient
number of mutated molecules might accumulate in the nuclei
by forming undegradable polyalanine-rich macromolecules. The
rate of the accumulation would then depend on the ratio of
mutated to non-mutated protein. The more severe phenotypes
observed in homozygotes for the (GCG)9 mutation and compound
heterozygotes for the (GCG)9 mutation and (GCG)7 allele may
correspond to the fact that in these cases PAB II oligomers
are composed only of mutated proteins. The ensuing faster
filament accumulation could cause accelerated cell death.
The recent description of nuclear filament inclusions in
Huntington's disease raises the possibility that "nuclear
toxicity" caused by the accumulation of mutated
homopolymeric domains is involved in the molecular
pathophysiology of other triplet-repeat diseases (Davies,
S.W. et al., Cell 90, 537-548 (1997); Scherzinger, E. et
al., Cell 90, 549-558 (1997); DiFiglia, M. et al., Science
277, 1990-1993 (1997)). Future immunocytochemical and
expression studies will be able to test this
pathophysiological hypothesis and provide some insight into
why certain muscle groups are more affected while all
tissues express PAB II.

Methods

Contig and cDNA selection 

The cosmid contig was constructed by standard cosmid
walking techniques using a gridded chromosome 14-specific
cosmid library (Evans, G.A. et al., Gene 79, 9-20 (1989)).
The cDNA clones were isolated by cDNA selection as
previously described (Rommens, J.M. et al., in Proceedings
of the third international workshop on the identification of
transcribed sequences (eds. Hochgeschwender, U. & Gardiner,
K.) 65-79 (Plenum, New York, 1994)).

Cloning of the PAB II gene. Three cDNA clones
corresponding to PAB II were sequenced (Sequenase, USB).
Clones were verified to map to cosmids by Southern
hybridization. The 8 kb HindIII restriction fragment was
subcloned from cosmid 166G8 into pBluescript II (SK)
(Stratagene). The clone was sequenced using primers derived
from the bPAB II gene and human EST sequences. Sequencing of
the PAB II introns was done by primer walking.

PAB II mutation screening and sequencing. All cases
were diagnosed as having OPMD on clinical grounds (Brais, B.
et al., Hum. Mol. Genet. 4, 429-434 (1995)). RT-PCR and
genomic SSCP analyses were done using standard protocols
(Lafreniere, R.G. et al., Nat. Genet. 15, 298-302 (1997)).
The primers used to amplify the PAB II mutated region were:
5'-CGCAGTGCCCCGCCTTAGA-3' (SEQ ID NO:4) and
5'-ACAAGATGGCGCCGCCGCCCCGGC-3' (SEQ ID NO:5). PCR reactions
were performed in a total volume of 15 µl containing: 40 ng
of genomic DNA; 1.5 µg of BSA; 1 µM of each primer; 250 µM
dCTP and dTTP; 25 µM dATP; 125 µM of dGTP and 125 µM of
7-deaza-dGTP (Pharmacia); 7.5% DMSO; 3.75 µCi [35S]dATP;
1.5 units of Taq DNA polymerase and 1.5 mM MgCl2 (Perkin
Elmer). For non-radioactive PCR reactions the [35S]dATP was
replaced by 225 µM of dATP. The amplification procedure
consisted of an initial denaturation step at 95°C for five
minutes, followed by 35 cycles of denaturation at 95°C for
15 s, annealing at 70°C for 30 s and elongation at 74°C for
30 s, and a final elongation at 74°C for 7 min. Samples were
loaded on 5% polyacrylamide denaturing gels. Following
electrophoresis, gels were dried and autoradiographs were
obtained. Sizes of the inserts were determined by comparing
to a standard M13 sequence (Sequenase, USB). Fragments used
for sequencing were gel-purified. Sequencing of the mutated
fragment using the Amplicycle kit (Perkin Elmer) was done
with the 5'-CGCAGTGCCCCGCCTTAGAGGTG-3' (SEQ ID NO:6) primer
at an elongation temperature of 68°C.
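The high GC content of the amplified region is reflected in the primers themselves. The following sketch, with a hypothetical helper name, computes the length and GC fraction of the three primers given above; it is a plain base count and makes no claim about melting temperatures.

```python
PRIMERS = {
    "SEQ ID NO:4": "CGCAGTGCCCCGCCTTAGA",
    "SEQ ID NO:5": "ACAAGATGGCGCCGCCGCCCCGGC",
    "SEQ ID NO:6": "CGCAGTGCCCCGCCTTAGAGGTG",
}


def gc_count(seq: str) -> int:
    """Number of G and C bases in a primer sequence."""
    seq = seq.upper()
    return seq.count("G") + seq.count("C")


summary = {
    name: (len(seq), gc_count(seq), gc_count(seq) / len(seq))
    for name, seq in PRIMERS.items()
}

for name, (length, gc, fraction) in summary.items():
    print(f"{name}: {length} nt, {gc} G/C, {fraction:.0%} GC")

# The sequencing primer (SEQ ID NO:6) is the forward PCR primer
# (SEQ ID NO:4) extended by four bases at its 3' end.
shares_prefix = PRIMERS["SEQ ID NO:6"].startswith(PRIMERS["SEQ ID NO:4"])
print(shares_prefix)
```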

Stability of (GCG)-repeat expansions. The meiotic
stability of the (GCG)9-repeat was estimated based on our
large French Canadian OPMD cohort. We previously established
that a single ancestral OPMD carrier chromosome was
introduced in the French Canadian population by three
sisters in 1648. Seventy of the seventy-one French Canadian
OPMD families tested to date segregate a (GCG)9 expansion.
However, in family F151, the affected brother and sister,
despite sharing the French Canadian ancestral haplotype,
carry a (GCG)12 expansion twice the size of the ancestral
(GCG)9 mutation (Fig. 2C). In our founder effect study, we
estimated that 450 (304-594) historical meioses shaped the
123 OPMD cases belonging to 42 of the 71 enrolled families.
Our screening of our full set of participants allowed us to
identify another 148 (GCG)9 carrier chromosomes. Therefore,
we estimate that a single mutation of the (GCG)9 expansion
has occurred in 598 (452-742) meioses.
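The arithmetic behind this estimate can be laid out explicitly. The sketch below simply adds the 148 additional carrier chromosomes to the point estimate and to both bounds of the historical meioses; the variable names are illustrative, and the per-meiosis rate printed at the end is a direct ratio of one event to the total, not a figure stated in the text.

```python
# Historical meioses from the founder effect study: estimate and range.
HISTORICAL_MEIOSES = {"estimate": 450, "low": 304, "high": 594}

# Additional (GCG)9 carrier chromosomes found on full screening.
ADDITIONAL_CHROMOSOMES = 148

# Observed changes in repeat size (the doubling in family F151).
OBSERVED_MUTATIONS = 1

total_meioses = {
    key: value + ADDITIONAL_CHROMOSOMES
    for key, value in HISTORICAL_MEIOSES.items()
}

rate = OBSERVED_MUTATIONS / total_meioses["estimate"]

print(total_meioses)
print(f"about {rate:.4f} mutations per meiosis")
```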

Genotype-phenotype correlations. 176 carriers of at
least one copy of the (GCG)9 mutation were examined during
the early stage of the linkage study. All were asked to
swallow 80 cc of ice-cold water as rapidly as possible.
Testing was stopped after 60 seconds. The swallowing time
(st) was validated as a sensitive test to identify OPMD
cases (Brais, B. et al., Hum. Mol. Genet. 4, 429-434 (1995);
Bouchard, J.-P. et al., Can. J. Neurol. Sci. 19, 296-297
(1992)). The st values for 76 (GCG)6 homozygous normal
controls are illustrated in Fig. 3. Analyses of variance
were computed by two-way ANOVA (SYSTAT package). For the
(GCG)9 homozygotes, their mean st value was compared to the
mean value for all (GCG)9 heterozygotes aged 35-40
(P < 10⁻⁵). For the (GCG)9 and (GCG)7 compound
heterozygotes, their mean st value was compared to the mean
value for all (GCG)9 heterozygotes aged 45-65 (P < 10⁻⁵).

While the invention has been described in con- 
nection with specific embodiments thereof, it will be 
understood that it is capable of further modifications 
and this application is intended to cover any varia- 
tions, uses, or adaptations of the invention following, 
in general, the principles of the invention and 
including such departures from the present disclosure 
as come within known or customary practice within the 
art to which the invention pertains and as may be 
applied to the essential features hereinbefore set 
forth, and as follows in the scope of the appended
claims.

WHAT IS CLAIMED IS:

1. A human PAB II gene containing transcribed
polymorphic GCG repeat, which comprises a sequence as set
forth in SEQ ID NO:3, which includes introns and flanking
genomic sequence.

2. The gene of claim 1, wherein allelic variants of 
GCG repeat are associated with a disease related with 
protein accumulation in nucleus. 

3. The gene of claim 2, wherein said protein accu- 
mulation is polyalanine accumulation. 

4. The gene of claim 1, wherein allelic variants of 
GCG repeat are associated with a disease related with 
swallowing difficulties. 

5. The gene of claim 1, wherein said disease is 
oculopharyngeal muscular dystrophy. 

6. A method for the diagnosis of a disease with
protein accumulation in nucleus, which comprises the
steps of:

a) obtaining a nucleic acid sample of said patient;
and

b) determining allelic variants of GCG repeat of
the gene of claim 1, and wherein long allelic
variants are indicative of a disease related
with protein accumulation in nucleus.

7. The method of claim 6, wherein said disease is
oculopharyngeal muscular dystrophy.

8. The method of claim 7, wherein said long allelic
variants have from about 245 to about 263 bp in length.

9. A non-human mammal model for the PAB II gene of 
claim 1, whose germ cells and somatic cells are modi- 
fied to express at least one allelic variant of the PAB 
II gene and wherein said allelic variant of the PAB II 
being introduced into the mammal, or an ancestor of the 
mammal, at an embryonic stage. 

10. A method for the screening of therapeutic agents 
for the prevention and/or treatment of oculopharyngeal 
muscular dystrophy, which comprises the steps of: 

a) administering said therapeutic agents to the 
non-human mammal of claim 9 or oculopharyngeal 
muscular dystrophy patients; and 

b) evaluating the prevention and/or treatment of 
development of oculopharyngeal muscular dystro- 
phy in said mammal or said patients. 

11. A method to identify genes part of or interacting
with a biochemical pathway affected by PAB II gene,
which comprises the steps of:

a) designing probes and/or primers using the hGTl 
gene of claim 1 and screening oculopharyngeal 
muscular dystrophy patients samples with said 
probes and/or primers; and 

b) evaluating the identified gene role in oculopha- 
ryngeal muscular dystrophy patients. 
SEQUENCE LISTING

<110> McGILL UNIVERSITY
ROULEAU, Guy A.
BRAIS, Bernard

<120> SHORT GCG EXPANSIONS IN THE PAB II GENE
FOR OCULOPHARYNGEAL MUSCULAR DYSTROPHY AND DIAGNOSTIC
THEREOF

<130> 1770-199PCT FC/ld 

<150> CA 2,218,199
<151> 1997-12-09 

<160> 6 

<170> FastSEQ for Windows Version 3.0 

<210> 1 
<211> 57 
<212> DNA 

<213> Artificial Sequence 
<400> 1 

atggcggcgg cggcggcggc ggcagcagca atggcggcgg cggcggcggc ggcggca 57 

<210> 2 
<211> 35 
<212> DNA 

<213> Artificial Sequence 
<400> 2 

maaaaaaaaa agaaggrgsm aaaaaaaaaa agaag 35

<210> 3 
<211> 6002 
<212> DNA 

<213> Artificial Sequence 
<400> 3 

aatgaaggtg gacacccaaa tagccccaat acaaatgcct gttcaatcaa ccaaacatct 60
aagcagcaca tctatgtggt agcatattgc caggccgtga gactgcgaat ataaatagga 120
accgcccctc atctgcaggc gctcacaacc tagttagcaa acagtaaaac aattaagcgc 180
gccgtggaca taggcccact tgtcctggga aatgagggga agctggggtt tgcagtggtt 240
tgattgaagg gggactacat gttagaggca cagactgggt gcaggtacac ccaaaggaac 300
gagaagagtg gaaggaaaca acatccacaa agtaaccaca tgctggcgta tcgaaggccg 360
tgatttacgg ttttgagact ttacctcgcc agcaaagggg ggccagtctg ttagcggtgc 420
agattggagg ggtgacattg gaagctgtcc aggaaaaaga aaatggaact ggggagcaga 480
aggcctacgc aagagggcgg gacagacagg acttgtgact agtagctctg gactgaggaa 540
tcctccctgc tttctggtgc gggagagcta gtggatgatg gtgccaataa cctggatggg 600
gaaagtaagc tccctcctgg aatgcttcat tcacaacctc cattttcagc aacatcccat 660
ctactggtgc ttcctggtcg agatacaagt ttcctgaaac tgctgctctg ttttgggcct 720
cacccggcca acagctcact agctggcaag cagtagtatc aagatggcgg ccccctagga 780
ctggctagtc atgtgacctc gggtttccca agtttgaagc ccggcagtcc tttcgggggc 840
aaggttcacc tgtcacgaaa cgagtgtcac cccttcgact ctcgcaagcc aatcggcatc 900
tgagactggg ccactgcggt gaggcgatcg gaagattggt cctttccagt cgcctagcta 960
gggccaatca cggagcgtcc catacttcgc gggcccgccc gtaggccggg gagaagcagg 1020
aatatcgtca cagcgtggcg gtattattac ctaaggactc gataggaggt gggacgcgtg 1080
ttgattgaca ggcagatttc cctaccggga tttgagaatt tggcgcagtg cccgccttag 1140
aggtgcgctt atttgattgc caagtaatat tccccaatgg agtactagct catggtgacg 1200
ggcaggcagc ttgagctaat gagtcctccg tggccggcgc agctctccac atgccgggcg 1260
gcgggcccca gtctgagcgg cgatggcggc ggcggcggcg gcggcagcag cagcgggggc 1320
tgcgggcggt cggggctccg ggccggggcg gcggcgccat cttgtgcccg gggccggtgg 1380
ggaggccggg gagggggccc cggggggcgc aggggactac gggaacggcc tggagtctga 1440
ggaactggag cctgaggagc tgctgctgga gcccgagccg gagcccgagc ccgaagagga 1500
gccgccccgg ccccgcgccc ccccgggagc tccgggccct gggcctggtt cgggagcccc 1560
cggcagccaa gaggaggagg aggagccggg actggtcgag ggtgacccgg gggacggcgc 1620
cattgaggac ccggtgagga aggagggcga gcgagcaggc cggcggctgg cgcgtcactg 1680
gaggcccaga gctcgggcga gcggtggcag gcggggggtg gggttgggcg gggaataacg 1740
tggctggggc gggtcgggcc ggggatgggt cagcgatcac tacaaggggc ccgactggct 1800
tgattcgggc gtcacgggtg cctagtgttg ttctagagag ggtagctttt cttttatcac 1860
gaccctcgca tggggcgagg gaaatggccg agcatggctg aggcgcgctc tggccgagag 1920
cagggcacag cccctgcgtt ggttcctctt aagctgtcct ccataccctc cccacttata 1980
ttaggagctg gaagctatca aagctcgagt cagggagatg gaggaagaag ctgagaagct 2040
aaaggagcta cagaacgagg tagagaagca gatgaatatg agtccacctc caggcaatgc 2100
tgagtaactg gcggttgcac gcggagcccg ggttctcggg ttggaagggt tgtggggagg 2160
atggggaatg tggggttaga tactcggcac cctggagctg cttgtctgag ctattatgac 2220
tgtgccgcgg tcatagtccg ttgtgtgttc ctctgacctt tgtgaggcag aactgatatt 2280
ttggtggtgg tagccttgtg cctccctttg tcctgttata attgtgttgc tctttattct 2340
tagtctacgt ctatctttct ttggtagagg ttgcgtgctc gcatttgacc ttcaaatcta 2400
atagtttttc ctccaattgg agacgcttta ggattctaag agaaagcaag ctggaagggg 2460
tttccccttt aaattctaga aatgtggagt ctcagcccac ttaattttgc tcactcttaa 2520
aagcatttca accaaagcca ttcattaggg atttgatttg gagggcagga gggattccta 2580
tactgtttta agtgtgtatt aattctttca atttatcgaa ttatttagtg agtaacctgc 2640
tatgcactag gcactattct cggcttgtgg gtacagcagg gaacagcaca gaccaaaatc 2700
tttgccttca ctgagcttat gggatagtgc tggtggtgga agtgcaacat attggtcaag 2760
tagaaaacaa gtgtgtggtt tttgtaaaaa attatttttt cctgatagct ggcccggtga 2820
tcatgtccat tgaggagaag atggaggctg atgcccgttc catctatgtt ggcaatgtga 2880
cgtactgggg ctctgactgg ggttgggggc aagttcttct tttggggaat tatttaatag 2940
tcctgaaaga acatctccgg gatagatgtg gttttgggtg tggagggagt gtgggaagga 3000
ggttaaaggt aatggaatga tcagtaatca gcaaaggctc tgggtttgga aggaaaagag 3060
attaattcct caaattacca gatttcatgt gctttggtgt atgatggccc agaccaaagg 3120
ctcgggaggg ttcttttgag acaggaattt gcctggtgcc tgtgaaattt ttctcctctc 3180
atcaggtgga ctatggtgca acagcagaag agctggaagc tcactttcat ggctgtggtt 3240
cagtcaaccg tgttaccata ctgtgtgaca aatttagtgg ccatcccaaa ggtaaagtaa 3300
aggggagtaa gttgagataa tttaaattac agtgtacaaa tagataaatt atgttttata 3360
ttgagcagta agttatttgg tgttaacaca ggtgatctgt gtcatttaag atcatggcat 3420
taatgttgat atatcaggag ttgcacctaa atgtcttcag aggccagata acaaaaatga 3480
aggctagatg tgggtgggat tacgaactag aaggggaggg gcagcttcta cttggcctat 3540
tatggcatat ggaaattcag gccctgtgtg tcttattttt acaaatttca aagagtagct 3600
ggaaatttta aaatttaaat gatttcgaat gattgaaatt ttccatttag aagaattttg 3660
acaaataaaa aatataactg cattgtagcc caaaacgaag catgcctgca ggttgaattt 3720
gacctgtgag gtatttgtaa cctcagagag atacaatgac aattcttttc aggtttgcgt 3780
atatagagtt ctcagacaaa gagtcagtga ggacttcctt ggccttagat gagtccctat 3840
ttagaggaag gcaaatcaag gtaagcctat gtccattgct gttctagttg tgtataaact 3900
ctccaggttg cctttaaggc tatcatttgt tcatctctga ctcaggtgat cccaaaacga 3960
accaacagac caggcatcag cacaacagac cggggttttc cacgagcccg ctaccgcgcc 4020
cggaccacca actacaacag ctcccgctct cgattctaca gtggttttaa cagcaggccc 4080
cggggtcgcg tctacaggtc aggatagatg ggctgctcct ctttcccccg cctcccgtga 4140
gccccgtatg cttcctcctc tctggtctga ggaacctccc tccccccacc cctccccgtg 4200
gtcttcagga actttgtctc ctgcctgtgc aggttgagga aggtagttgc aggccaggcc 4260
agaaggcagc ctcatcatct tttctgcagt agaaattggt gataagggct gcatccctcc 4320
cttggttcaa agaggcttcc acccccagcc ttttttttct tgggagttgg tggcatttga 4380
aggtgtttgc ggacaaaact gggaggaaca gggcctccag gaagttgaaa gcactgcttg 4440
gacatttgtt acttttttcg gagttaggga gggattgaag actgaacctc ccttggaaga 4500
ataccagagg ctagctagtt gatcctccca acagccttgt gggaggattt tgagatactt 4560
attctttatt tgagccagtc ttgcaaggtt aacttctcac tgggcctagt gtggtnccca 4620
ggtttttgcc ttgcttcact tctgtctcta catttaaata gacgggttag gcatataaac 4680
cttggctttt cataagctct acctgcctat ccccaggagt tagggaggat ctatttgtga 4740
aggccctagg gtttaaaaac tgtggaggac tgaaaaactg gataaaaagg gggtcctttt 4800
ccttgcccct gtctctcact cagatgcgct tctttttcgc cactgtttgg caaagttttc 4860
tgttaagccc ccctccccct gccccagttc tcccaggtgc gttactattt ctgggatcat 4920
ggggtcggtt ttaggacact tgaacacttc ttttcccccc ttcccttcac agtaactggg 4980
gcaggggcct acggggaggg gcttgtactg aactatctag tgatcacgtt aacacctaac 5040
tctccttctt tcttccaggg gccgggctag agcgacatca tggtattccc cttactaaaa 5100
aaagtgtgta ttaggaggag agagaggaaa aaaagaggaa agaaggaaaa aaaaaagaat 5160
taaaaaaaaa aaaaagaaaa acagaagatg accttgatgg aaaaaaaata ttttttaaaa 5220
aaaagatata ctgtggaagg ggggagaatc ccataactaa ctgctgagga gggacctgct 5280
ttggggagta ggggaaggcc cagggagtgg ggcagggggc tgcttattca ctctggggat 5340
tcgccatgga cacgtctcaa ctgcgcaagc tgcttgccca tgtttccctg cccccttcac 5400
ccccttgggc ctgctcaagg gtaggtgggc gtgggtggta ggagggtttt ttttacccag 5460
ggctctggaa ggacaccaaa ctgttctgct tgttaccttc cctcccgtct tctcctcgcc 5520
tttcacagtc ccctcctgcc tgctcctgtc cagccaggtc taccacccac cccacccctc 5580
tttctccggc tccctgcccc tccagattgc ctggtgatct attttgtttc cttttgtgtt 5640
tctttttctg ttttgagtgt ctttctttgc aggtttctgt agccggaaga tctccgttcc 5700
gctcccagcg gctccagtgt aaattcccct tccccctggg gaaatgcact accttgtttt 5760
ggggggttta ggggtgtttt tgtttttcag ttgttttgtt tttttgtttt ttttttttcc 5820
tttgcctttt ttccctttta tttggaggga atgggaggaa gtgggaacag ggaggtggga 5880
ggtggatttt gtttattttt ttagctcatt tccaggggtg ggaatttttt tttaatatgt 5940
gtcatgaata aagttgtttt tgaaaataaa aaaaaaaaaa aaaaaaaaaa aaaaaaaaaa 6000
aa 6002



<210> 4 
<211> 19 
<212> DNA 

<213> Artificial Sequence 

<400> 4 
cgcagtgccc cgccttaga 

<210> 5 

<211> 24 

<212> DNA 

<213> Artificial Sequence 

<400> 5 
acaagatggc gccgccgccc cggc 

<210> 6 

<211> 23 

<212> DNA 

<213> Artificial Sequence 

<400> 6 
cgcagtgccc cgccttagag gtg 